i-Code: An Integrative and Composable Multimodal Learning Framework
Authors
Abstract
Human intelligence is multimodal; we integrate visual, linguistic, and acoustic signals to maintain a holistic worldview. Most current pretraining methods, however, are limited to one or two modalities. We present i-Code, a self-supervised pretraining framework where users may flexibly combine the modalities of vision, speech, and language into unified and general-purpose vector representations. In this framework, data from each modality are first given to pretrained single-modality encoders. The encoder outputs are then integrated with a multimodal fusion network, which uses novel merge- and co-attention mechanisms to effectively combine information from the different modalities. The entire system is pretrained end-to-end with new objectives including masked modality unit modeling and cross-modality contrastive learning. Unlike previous research using only video for pretraining, the i-Code framework can dynamically process single, dual, and triple-modality data during training and inference, flexibly projecting different combinations of modalities into a single representation space. Experimental results demonstrate how i-Code can outperform state-of-the-art techniques on five multimodal understanding tasks and single-modality benchmarks, improving by as much as 11% and demonstrating the power of integrative multimodal pretraining.
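The cross-modality contrastive objective named in the abstract can be illustrated with a generic symmetric InfoNCE-style loss. This is a minimal sketch under stated assumptions, not the paper's exact formulation: the function names, the temperature value of 0.07, and the use of plain Python lists as embeddings are all hypothetical choices made for illustration. Row i of each batch is assumed to come from the same underlying sample (for example, the speech and language embeddings of one clip).

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def log_softmax(row):
    """Numerically stable log-softmax over one row of similarity scores."""
    peak = max(row)
    log_sum = peak + math.log(sum(math.exp(x - peak) for x in row))
    return [x - log_sum for x in row]


def contrastive_loss(emb_a, emb_b, temperature=0.07):
    """Symmetric InfoNCE-style loss between two batches of modality embeddings.

    emb_a[i] and emb_b[i] are treated as a positive pair; every other
    pairing in the batch is treated as a negative. The temperature is an
    assumed hyperparameter, not a value taken from the paper.
    """
    a = [l2_normalize(v) for v in emb_a]
    b = [l2_normalize(v) for v in emb_b]
    n = len(a)
    # Scaled cosine similarity between every a-row and every b-row.
    sim = [
        [sum(x * y for x, y in zip(a[i], b[j])) / temperature for j in range(n)]
        for i in range(n)
    ]
    # Direction a -> b: each a-row should pick out its own b-row.
    loss_ab = -sum(log_softmax(sim[i])[i] for i in range(n)) / n
    # Direction b -> a: same computation on the transposed matrix.
    sim_t = [[sim[i][j] for i in range(n)] for j in range(n)]
    loss_ba = -sum(log_softmax(sim_t[j])[j] for j in range(n)) / n
    return 0.5 * (loss_ab + loss_ba)


# Toy batch: two samples, two modalities, 2-dimensional embeddings.
aligned = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy batch the loss is close to zero when paired embeddings point in the same direction and large when the pairing is swapped, which is the behavior that pulls matching modalities together in a shared representation space.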
Similar resources
Nursing leadership competency learning- an integrative review
Background: In the last decade, literature, inquiries and reports into the shortcomings in health services have highlighted the vital role of leadership in clinical practice and the impact on patient care and effective workplace culture. Given the important role of nurses as the largest therapeutic group in health systems, the question is how nurses acquire clinical leadership ...
Resource complementarity and type of interorganizational learning: an integrative framework
Purpose – Resource-and knowledge-based scholars claim that firms should focus on the creation and accumulation of knowledge-based competencies in order to yield long-term survival. Several authors have emphasized the added value of alliance relationships in the knowledge development and learning processes of organizations. The knowledge-based view of interfirm alliance has recently drawn increa...
Organizational learning and capabilities : An integrative conceptual framework
Organizational learning (Bontis, Crossan and Hulland, 2002) and capabilities (Barney, 1991) have been argued to increase performance. Recently, connections have been established between organizational learning and capabilities. On the one hand, learning has been considered as a capability (Hult and Ketchen, 2001; Goh, 2003; Henri, 2006), leading to the idea of “learning capability”. On the othe...
A Multimodal Interaction Framework for Blended Learning
Humans interact with each other by utilizing the five basic senses as input modalities, whereas sounds, gestures, facial expressions etc. are utilized as output modalities. Multimodal interaction is also used between humans and their surrounding environment, although enhanced with further senses such as equilibrioception and the sense of balance. Computer interfaces that are considered as a dif...
A Composable Reflective Communication Framework
A communication service is described by an abstract protocol that specifies a set of roles to be played by participants, the requirements on role players and installation information. The (dynamic) installation of a protocol requires no knowledge or modification of the component itself; it is sufficient to encapsulate each component in a layer that implements the role it is to play, affecting o...
Journal
Journal title: Proceedings of the ... AAAI Conference on Artificial Intelligence
Year: 2023
ISSN: 2159-5399, 2374-3468
DOI: https://doi.org/10.1609/aaai.v37i9.26290